A Mandarin to Taiwanese Min Nan Machine Translation System with Speech Synthesis of Taiwanese Min Nan

نویسندگان

  • Chuan-Jie Lin
  • Hsin-Hsi Chen
چکیده

This paper presents a design of a Mandarin to Taiwanese Min Nan (abbreviated as Taiwanese hereafter) machine translation system. It is the first machine translation system which focuses on these two languages. An input Mandarin sentence is segmented, tagged and translated word by word according to the part of speech of each word. The candidates come from a Mandarin-Taiwanese dictionary. If more than one candidate exists, an example base is consulted. When a Mandarin word is not found in the Mandarin-Taiwanese dictionary, it is translated according to a SingleCharacter dictionary. The output can be in terms of either speech or text. For speech output, we also deal with the tone sandhi problem in changing the tone of each Taiwanese syllable. Because the mapping between Taiwanese syllables and Chinese characters is still a subject of disagreement, and the phonetic spelling coding systems are not familiar to everybody, speech output is useful but is also a challenge.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A bi-lingual Mandarin/taiwanese (min-nan), large vocabulary, continuous speech recognition system based on the tong-yong phonetic alphabet (TYPA)

In this paper, we describe the first Mandarin/Taiwanese (Min-nan) bi-lingual, continuous speech recognition system for large vocabulary or vocabulary-independent applications. A phonetic transcription system called Tong-yong Phonetic Alphabet (TYPA) is described and used to transcribe the bilingual Mandarin/Taiwanese lexicons. The Right-ContextDependent (RCD) phonetic continuous-density Hidden ...

متن کامل

Large vocabulary taiwanese (min-nan) speech recognition using tone features and statistical pronunciation modeling

A large vocabulary Taiwanese (Min-nan) speech recognition system is described in this paper. Due to the severe multiple pronunciation phenomenon in Taiwanese partly caused by tone sandhi, a statistical pronunciation modeling technique based on tonal features is used. This system is speaker independent. It was trained by a bi-lingual Mandarin/Taiwanese speech corpus to alleviate the lack of pure...

متن کامل

Toward Constructing A Multilingual Speech Corpus for Taiwanese (Min-nan), Hakka, and Mandarin Chinese

The Formosa speech database (ForSDat) is a multilingual speech corpus collected at Chang Gung University and sponsored by the National Science Council of Taiwan. It is expected that a multilingual speech corpus will be collected, covering the three most frequently used languages in Taiwan: Taiwanese (Min-nan), Hakka, and Mandarin. This 3-year project has the goal of collecting a phonetically ab...

متن کامل

A Taiwanese (min-nan) text-to-speech (TTS) system based on automatically generated synthetic units

A Taiwanese (Min-nan) Text-to-Speech (TTS) system has been constructed in this paper based on automatically generated synthetic units by considering several specific phonetic and linguistic characteristics of Taiwanese. Some basic facts about Taiwanese useful in a TTS system is summarized, including the issues of tone sandhi, the writen format and the others. Three functional modules, namely a ...

متن کامل

Modeling Pronunciation Variation for Bi-Lingual Mandarin/Taiwanese Speech Recognition

In this paper, a bi-lingual large vocaburary speech recognition experiment based on the idea of modeling pronunciation variations is described. The two languages under study are Mandarin Chinese and Taiwanese (Min-nan). These two languages are basically mutually unintelligible, and they have many words with the same Chinese characters and the same meanings, although they are pronounced differen...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • IJCLCLP

دوره 4  شماره 

صفحات  -

تاریخ انتشار 1999